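Physically rearranging objects is an important capability for embodied agents. Visual room rearrangement evaluates an agent's ability to rearrange objects in a room to a desired goal state based solely on visual input. We propose a simple yet effective method for this problem: (1) search for and map which objects need to be rearranged, and (2) rearrange each object until the task is complete. Our approach consists of an off-the-shelf semantic segmentation model, a voxel-based semantic map, and a semantic search policy to efficiently find the objects that need to be rearranged. On the AI2-THOR Rearrangement Challenge, our method improves on current state-of-the-art end-to-end reinforcement learning methods that learn visual rearrangement policies, from 0.53% correct rearrangement to 16.56%, while using only 2.7% as many samples from the environment.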
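The voxel-based semantic map mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `VoxelSemanticMap`, the majority-vote rule, and the 0.25 m voxel size are all assumptions chosen for illustration. It only shows the general idea of binning labelled 3D points into voxels and reading back the most frequent label per voxel.

```python
import math
from collections import Counter, defaultdict


class VoxelSemanticMap:
    """Hypothetical sparse voxel grid storing per-voxel semantic label votes."""

    def __init__(self, voxel_size=0.25):
        if voxel_size <= 0:
            raise ValueError("voxel_size must be positive")
        self.voxel_size = voxel_size
        # Maps an integer voxel index (i, j, k) to a Counter of label votes.
        self._votes = defaultdict(Counter)

    def _key(self, point):
        # Convert a metric (x, y, z) point into an integer voxel index.
        return tuple(math.floor(c / self.voxel_size) for c in point)

    def add_observation(self, point, label):
        # Record one segmented 3D point (assumed already in world coordinates).
        self._votes[self._key(point)][label] += 1

    def label_at(self, point):
        # Return the majority label of the voxel containing `point`, or None.
        votes = self._votes.get(self._key(point))
        if not votes:
            return None
        return votes.most_common(1)[0][0]


semantic_map = VoxelSemanticMap(voxel_size=0.25)
semantic_map.add_observation((0.10, 0.10, 0.90), "mug")
semantic_map.add_observation((0.12, 0.08, 0.95), "mug")
semantic_map.add_observation((0.20, 0.05, 0.80), "bowl")  # noisy vote, same voxel

observed_label = semantic_map.label_at((0.10, 0.10, 0.90))
unseen_label = semantic_map.label_at((2.0, 2.0, 2.0))
```

A majority vote per voxel is one simple way to absorb occasional segmentation errors. Comparing two such maps, one built in the goal configuration and one in the current configuration, would be one plausible way to flag objects that have moved.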
translated by 谷歌翻译
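Hierarchies of temporally decoupled policies present a promising approach for enabling structured exploration in complex long-horizon planning problems. To fully realize this approach, an end-to-end training paradigm is needed. However, training these multi-level policies has had limited success because of challenges arising from the interactions between the goal-assigning and goal-achieving levels within a hierarchy. In this paper, we treat the policy optimization process as a multi-agent process. This allows us to draw on connections between communication and cooperation in multi-agent RL, and to demonstrate the benefits of increased cooperation between sub-policies for the training performance of the overall policy. We introduce a simple yet effective technique for inducing inter-level cooperation by modifying the objective function and subsequent gradients of the higher-level policies. Experimental results on a wide variety of simulated robotics and traffic control tasks demonstrate that inducing cooperation results in stronger-performing policies and increased sample efficiency on a set of difficult long-time-horizon tasks. We also find that goal-conditioned policies trained with our method transfer better to new tasks, highlighting the benefits of our method for learning task-agnostic lower-level behaviors. Videos and code are available at: https://sites.google.com/berkeley.edu/cooperative-hrl.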
Recent advances in neural rendering imply a future of widespread visual data distributions through sharing NeRF model weights. However, while common visual data (images and videos) have standard approaches to embed ownership or copyright information explicitly or subtly, the problem remains unexplored for the emerging NeRF format. We present StegaNeRF, a method for steganographic information embedding in NeRF renderings. We design an optimization framework allowing accurate hidden information extractions from images rendered by NeRF, while preserving its original visual quality. We perform experimental evaluations of our method under several potential deployment scenarios, and we further discuss the insights discovered through our analysis. StegaNeRF signifies an initial exploration into the novel problem of instilling customizable, imperceptible, and recoverable information to NeRF renderings, with minimal impact to rendered images. Project page: https://xggnet.github.io/StegaNeRF/.
We study fair multi-objective reinforcement learning in which an agent must learn a policy that simultaneously achieves high reward on multiple dimensions of a vector-valued reward. Motivated by the fair resource allocation literature, we model this as an expected welfare maximization problem, for some non-linear fair welfare function of the vector of long-term cumulative rewards. One canonical example of such a function is the Nash Social Welfare, or geometric mean, the log transform of which is also known as the Proportional Fairness objective. We show that even approximately optimal optimization of the expected Nash Social Welfare is computationally intractable even in the tabular case. Nevertheless, we provide a novel adaptation of Q-learning that combines non-linear scalarized learning updates and non-stationary action selection to learn effective policies for optimizing nonlinear welfare functions. We show that our algorithm is provably convergent, and we demonstrate experimentally that our approach outperforms techniques based on linear scalarization, mixtures of optimal linear scalarizations, or stationary action selection for the Nash Social Welfare Objective.
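To make the welfare function concrete, the following sketch computes the Nash Social Welfare (the geometric mean of the per-objective cumulative rewards) and its log transform, and contrasts them with a plain average as a stand-in for linear scalarization. The function names and the example reward vectors are illustrative assumptions, not taken from the paper, and rewards are assumed to be strictly positive so that the logarithm is defined.

```python
import math


def nash_social_welfare(rewards):
    """Geometric mean of strictly positive per-objective cumulative rewards."""
    if not rewards or any(r <= 0 for r in rewards):
        raise ValueError("rewards must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(r) for r in rewards) / len(rewards))


def proportional_fairness(rewards):
    """Log transform of the Nash Social Welfare (the mean of the log rewards)."""
    return math.log(nash_social_welfare(rewards))


def linear_scalarization(rewards):
    """Equal-weight linear scalarization (arithmetic mean), for comparison."""
    return sum(rewards) / len(rewards)


# Two hypothetical reward vectors with the same total reward.
balanced = [5.0, 5.0]
skewed = [9.0, 1.0]

nsw_balanced = nash_social_welfare(balanced)
nsw_skewed = nash_social_welfare(skewed)
```

Both vectors have the same arithmetic mean, so an equal-weight linear objective cannot tell them apart, while the geometric mean ranks the balanced outcome higher. This is the sense in which the nonlinear welfare function encodes fairness across objectives.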
Computational catalysis is playing an increasingly significant role in the design of catalysts across a wide range of applications. A common task for many computational methods is the need to accurately compute the minimum binding energy - the adsorption energy - for an adsorbate and a catalyst surface of interest. Traditionally, the identification of low energy adsorbate-surface configurations relies on heuristic methods and researcher intuition. As the desire to perform high-throughput screening increases, it becomes challenging to use heuristics and intuition alone. In this paper, we demonstrate machine learning potentials can be leveraged to identify low energy adsorbate-surface configurations more accurately and efficiently. Our algorithm provides a spectrum of trade-offs between accuracy and efficiency, with one balanced option finding the lowest energy configuration, within a 0.1 eV threshold, 86.63% of the time, while achieving a 1387x speedup in computation. To standardize benchmarking, we introduce the Open Catalyst Dense dataset containing nearly 1,000 diverse surfaces and 87,045 unique configurations.
Artificial intelligence (AI) has enormous potential to improve Air Force pilot training by providing actionable feedback to pilot trainees on the quality of their maneuvers and enabling instructor-less flying familiarization for early-stage trainees in low-cost simulators. Historically, AI challenges consisting of data, problem descriptions, and example code have been critical to fueling AI breakthroughs. The Department of the Air Force-Massachusetts Institute of Technology AI Accelerator (DAF-MIT AI Accelerator) developed such an AI challenge using real-world Air Force flight simulator data. The Maneuver ID challenge assembled thousands of virtual reality simulator flight recordings collected by actual Air Force student pilots at Pilot Training Next (PTN). This dataset has been publicly released at Maneuver-ID.mit.edu and represents the first of its kind public release of USAF flight training data. Using this dataset, we have applied a variety of AI methods to separate "good" vs "bad" simulator data and categorize and characterize maneuvers. These data, algorithms, and software are being released as baselines of model performance for others to build upon to enable the AI ecosystem for flight simulator training.
Reducing the quantity of annotations required for supervised training is vital when labels are scarce and costly. This reduction is especially important for semantic segmentation tasks involving 3D datasets that are often significantly smaller and more challenging to annotate than their image-based counterparts. Self-supervised pre-training on large unlabelled datasets is one way to reduce the amount of manual annotations needed. Previous work has focused on pre-training with point cloud data exclusively; this approach often requires two or more registered views. In the present work, we combine image and point cloud modalities, by first learning self-supervised image features and then using these features to train a 3D model. By incorporating image data, which is often included in many 3D datasets, our pre-training method only requires a single scan of a scene. We demonstrate that our pre-training approach, despite using single scans, achieves comparable performance to other multi-scan, point cloud-only methods.
IMPORTANCE: An interpretable machine learning model can provide faithful explanations of each prediction and yet maintain higher performance than its black box counterpart. OBJECTIVE: To design an interpretable machine learning model which accurately predicts EEG protopatterns while providing an explanation of its predictions with assistance of a specialized GUI. To map the cEEG latent features to a 2D space in order to visualize the ictal-interictal-injury continuum and gain insight into its high-dimensional structure. DESIGN, SETTING, AND PARTICIPANTS: 50,697 50-second cEEG samples from 2,711 ICU patients collected between July 2006 and March 2020 at Massachusetts General Hospital. Samples were labeled as one of 6 EEG activities by domain experts, with 124 different experts providing annotations. MAIN OUTCOMES AND MEASURES: Our neural network is interpretable because it uses case-based reasoning: it compares a new EEG reading to a set of learned prototypical EEG samples from the training dataset. Interpretability was measured with task-specific neighborhood agreement statistics. Discriminatory performance was evaluated with AUROC and AUPRC. RESULTS: The model achieves AUROCs of 0.87, 0.93, 0.96, 0.92, 0.93, 0.80 for classes Seizure, LPD, GPD, LRDA, GRDA, Other respectively. This performance is statistically significantly higher than that of the corresponding uninterpretable (black box) model with p<0.0001. Videos of the ictal-interictal-injury continuum are provided. CONCLUSION AND RELEVANCE: Our interpretable model and GUI can act as a reference for practitioners who work with cEEG patterns. We can now better understand the relationships between different types of cEEG patterns. In the future, this system may allow for targeted intervention and training in clinical settings. It could also be used for re-confirming or providing additional information for diagnostics.
Transfer Learning is an area of statistics and machine learning research that seeks answers to the following question: how do we build successful learning algorithms when the data available for training our model is qualitatively different from the data we hope the model will perform well on? In this thesis, we focus on a specific area of Transfer Learning called label shift, also known as quantification. In quantification, the aforementioned discrepancy is isolated to a shift in the distribution of the response variable. In such a setting, accurately inferring the response variable's new distribution is both an important estimation task in its own right and a crucial step for ensuring that the learning algorithm can adapt to the new data. We make two contributions to this field. First, we present a new procedure called SELSE which estimates the shift in the response variable's distribution. Second, we prove that SELSE is semiparametric efficient among a large family of quantification algorithms, i.e., SELSE's normalized error has the smallest possible asymptotic variance matrix compared to any other algorithm in that family. This family includes nearly all existing algorithms, including ACC/PACC quantifiers and maximum likelihood based quantifiers such as EMQ and MLLS. Empirical experiments reveal that SELSE is competitive with, and in many cases outperforms, existing state-of-the-art quantification methods, and that this improvement is especially large when the number of test samples is far greater than the number of train samples.
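For readers unfamiliar with quantification, the sketch below shows the classical binary Adjusted Classify and Count (ACC) estimator, one of the baseline families named above. It is not SELSE. The function name, the example rates, and the clipping to [0, 1] are illustrative assumptions. The idea is that under label shift the classifier's true and false positive rates stay fixed, so the observed fraction of positive predictions on the test set can be inverted to recover the new class prevalence.

```python
def adjusted_classify_and_count(test_predictions, tpr, fpr):
    """Estimate the positive-class prevalence in the test set (binary ACC).

    test_predictions: iterable of 0/1 hard predictions on unlabeled test data.
    tpr, fpr: true and false positive rates estimated on held-out training data.
    """
    predictions = list(test_predictions)
    if not predictions:
        raise ValueError("test_predictions must not be empty")
    if tpr <= fpr:
        raise ValueError("ACC needs a classifier with tpr > fpr")
    # Naive "classify and count" estimate: fraction predicted positive.
    observed_rate = sum(predictions) / len(predictions)
    # Under label shift: observed_rate = p * tpr + (1 - p) * fpr; solve for p.
    estimate = (observed_rate - fpr) / (tpr - fpr)
    # Clip because sampling noise can push the estimate outside [0, 1].
    return min(1.0, max(0.0, estimate))


# Hypothetical test set: 45 of 100 samples are predicted positive.
predictions = [1] * 45 + [0] * 55
estimated_prevalence = adjusted_classify_and_count(predictions, tpr=0.8, fpr=0.1)
```

With these numbers the naive count suggests 45% positives, but correcting for the classifier's error rates gives an estimate of about 50%, since 0.5 × 0.8 + 0.5 × 0.1 = 0.45.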
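Concept induction, which is based on formal logical reasoning over description logics, has been used in ontology engineering to create ontology (TBox) axioms from the base data (ABox) graph. In this paper, we show that it can also be used to explain data differentials, for example in the context of Explainable AI (XAI), and that it can in fact do so in a way that is meaningful to a human observer. Our approach uses a large class hierarchy, curated from the Wikipedia category hierarchy, as background knowledge.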